Conducting Vessel Data Imputation Method Selection Based on Dataset Characteristics
Authors
Abstract
Time series datasets collected from marine sensors inevitably suffer from missing data. The missing data make the sensor readings unreliable for supporting the decision-making process. Many methods are available to impute missing values. However, selecting the best imputation method is not a trivial task, as it usually requires domain expertise and several trial-and-error iterations. Furthermore, when imputation is carried out carelessly, it generates a high error factor that can lead stakeholders to wrong assumptions. This paper provides a systematic approach that extracts characteristics of the underlying dataset and, based on them, recommends the less error-prone imputation method. We evaluate our proposed approach using nine real-world vessel datasets. In total, we generated 3859 samples consisting of 17 inputs and 1 target feature. Experimental results show that the approach is capable of obtaining a weighted F1-Score of 92.6%. Additionally, compared with applying the selected imputation methods directly, our work gains up to 86% in average score, with the worst case being 5%. We empirically demonstrate that the approach selects imputation methods efficiently.
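The abstract reports a weighted F1-Score for the recommender. As a minimal sketch of how that metric is conventionally defined (the support-weighted mean of per-class F1 scores), the following may help. The function name `weighted_f1` and the method labels in the example are hypothetical and are not taken from the paper, which does not list its candidate methods here.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Each class's F1 is weighted by how often that class occurs in y_true.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: predicted this label and it was correct.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Hypothetical labels: the "best imputation method" for six samples.
y_true = ["linear", "linear", "linear", "mean", "mean", "knn"]
y_pred = ["linear", "linear", "mean", "mean", "mean", "knn"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by class support means that imputation methods recommended more often in the ground truth contribute more to the final score, which matters when the target classes are imbalanced.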
Similar resources
Machine Learning Based Missing Value Imputation Method for Clinical Dataset
Missing value imputation is one of the biggest tasks of data pre-processing when performing data mining. Most medical datasets are usually incomplete. Simply removing the cases from the original datasets can bring more problems than solutions. A suitable method for missing value imputation can help to produce good quality datasets for better analysing clinical trials. In this paper we explore t...
Traffic Speed Data Imputation Method Based on Tensor Completion
Traffic speed data plays a key role in Intelligent Transportation Systems (ITS); however, missing traffic data would affect the performance of ITS as well as Advanced Traveler Information Systems (ATIS). In this paper, we handle this issue by a novel tensor-based imputation approach. Specifically, tensor pattern is adopted for modeling traffic speed data and then High accurate Low Rank Tensor C...
Missing Value Imputation Based on Data Clustering
We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values that are generated from the data in the instances which do not contain missing values and are most similar to t...
Robust Tree-Based Incremental Imputation Method for Data Fusion
Data Fusion and Data Grafting are concerned with combining files and information coming from different sources. The problem is not to extract data from a single database, but to merge information collected from different sample surveys. The typical data fusion situation is formed of two data samples, the former made up of a complete data matrix X relative to a first survey, and the latter Y which ...
Tumor Gene Characteristics Selection Method Based on Multi-Agent
For tumor gene expression profile data, which is high-dimensional with small samples, effectively selecting the classification features of samples among thousands of genes is a difficult problem in the analysis of tumor gene expression profiles. The method first partitions the data set into K equal divisions, performs feature selection on each with the Lasso method, and then merges each s...
Journal
Journal title: IOP conference series
Year: 2023
ISSN: ['1757-899X', '1757-8981']
DOI: https://doi.org/10.1088/1755-1315/1198/1/012017